A Clustered Approach to Multithreaded Processors
Abstract
With aggressive superscalar processors delivering diminishing returns, alternative designs that make good use of increasing chip densities are actively being explored. One such approach is simultaneous multithreading (SMT), where a conventional superscalar supports multiple threads such that instructions from different threads may be issued in a single cycle. Another approach is the on-chip multiprocessor and its variants. Unlike the SMT approach, all the resources in this architecture have a fixed assignment (FA). The design simplicity of the FA approach enables high clock frequencies, while the flexibility of the SMT approach allows it to adapt to the specific thread- and instruction-level parallelism of the application. Unfortunately, the strict partitioning of resources among the processors in the FA architecture may result in under-utilization of the chip, while the fully centralized structure of the SMT may result in a longer clock cycle time. In this paper, we explore a hybrid design, where a chip is composed of a set of SMT processors. We evaluate such a clustered architecture running parallel applications. We consider both a low-end machine with only one processor chip on which to run multiple threads and a high-end machine with several processor chips working on the same application. Overall, we conclude that such a hybrid processor represents a good performance-complexity design point.
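The utilization side of the trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the paper's evaluation methodology: the function name `issued_per_cycle`, the total issue width of 8, the round-robin thread placement, and the per-thread demand figures are all assumptions made for illustration, and the model deliberately ignores the clock cycle time advantage of simpler designs.

```python
def issued_per_cycle(demands, clusters, total_width=8):
    """Toy estimate of instructions issued in one cycle.

    demands     -- ready instructions per thread this cycle (hypothetical numbers)
    clusters    -- number of equal partitions of the chip's issue slots
                   (1 = fully shared SMT, len(demands) = fixed assignment)
    total_width -- total issue slots on the chip (assumed value)
    """
    if total_width % clusters != 0:
        raise ValueError("total_width must divide evenly among clusters")
    width = total_width // clusters

    # Assign thread i to cluster i % clusters (an assumed placement policy)
    # and sum the demand that lands on each cluster.
    load = [0] * clusters
    for i, demand in enumerate(demands):
        load[i % clusters] += demand

    # Each cluster can issue at most `width` instructions per cycle.
    return sum(min(cluster_load, width) for cluster_load in load)


demands = [4, 1, 1, 0]  # one thread with high ILP, three with little

fa = issued_per_cycle(demands, clusters=4)         # four 2-wide processors
clustered = issued_per_cycle(demands, clusters=2)  # two 4-wide SMT clusters
smt = issued_per_cycle(demands, clusters=1)        # one 8-wide SMT

print(fa, clustered, smt)  # prints "4 5 6"
```

With these assumed demands, the fixed-assignment chip leaves slots idle because the high-ILP thread cannot borrow from its neighbours, the fully shared SMT issues everything that is ready, and the clustered design falls in between. Whether the clustered design wins overall depends on the cycle time each organization can sustain, which this sketch does not model.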
Similar resources
Measuring the Performance of Multithreaded Processors
Multithreaded architectures are becoming more and more popular, and many processor vendors have already shipped processors with multithreaded features. Despite this push toward multithreaded processors, there is still no clear procedure that defines how to measure the behavior of a multithreaded processor. This paper presents FAME, a new evaluation methodology aimed to...
Worst-Case Execution Time Estimation for Hardware-assisted Multithreaded Processors
This paper introduces a method for bounding the worst-case performance of programs running on multithreaded processors, such as the embedded cores found within network processors (NPs). Worst-case bounds can be useful in determining whether a given software implementation will provide stable (e.g., line rate) performance under all traffic conditions. Our method extends an approach from the real...
Operating System Scheduling for Chip Multithreaded Processors
This dissertation addresses operating system thread scheduling for chip multithreaded processors. Chip multithreaded processors are becoming mainstream thanks to their superior performance and power characteristics. Threads running concurrently on a chip multithreaded processor share the processor’s resources. Resource contention, and accordingly performance, depends on characteristics of the c...
Exploiting Thread-Level Parallelism on Simultaneous Multithreaded Processors
Bus Utilization Analysis of Multithreaded Shared-Bus Multiprocessors: Initial Results
A shared-bus shared-memory multiprocessor based on multithreaded CPUs is evaluated against different solutions for cache and coherence protocols. Multithreaded architectures have been intensively studied for DSM multiprocessors, where memory latencies are a major factor in limiting performance. They can also be of interest for bus-based multiprocessors, since processor speeds are increasing at a...